11 research outputs found

    Resource-constrained Multi-agent Markov Decision Processes

    No full text
    Intelligent autonomous agents, designed to automate and simplify many aspects of our society, will increasingly be required to also interact with other agents autonomously. Where agents interact, they are likely to encounter resource constraints. For example, agents managing household appliances to optimize electricity usage might need to share the limited capacity of the distribution grid.This thesis describes research into new algorithms for optimizing the behavior of agents operating in constrained environments, when these agents have significant uncertainty about the effects of their actions on their state. Such systems are effectively modeled in a framework of constrained multi-agent Markov decision processes (MDPs). A single-agent MDP model captures the uncertainty in the outcome of the actions chosen by a specific agent. It does so by providing a probabilistic model of state transitions, describing the likelihood of arriving in a future state, conditional on the current state and action. Agents collect different rewards or penalties depending on the current state and chosen action, informing their objective of maximizing their expected reward. To include constraints, resource consumption functions are added to the actions, and the agents' (shared) objective is modified with a condition restricting their (cumulative) resource consumption. We propose novel algorithms to advance the state of the art in three challenging settings: computing static preallocations off-line, computing dynamic (re)allocations on-line, and optimally learning model dynamics through safe reinforcement learning under the constraints. Taken together, these algorithms show how agents can coordinate their actions under uncertainty and shared resource constraints in a broad range of conditions. Furthermore, the proposed solutions are complementary: static preallocations can be used as back-up strategy for when a communication disruption prevents the use of dynamic allocations.Algorithmic

    Project Scheduling: The Impact of Instance Structure on Heuristic Performance

    No full text
    Many meta-heuristic approaches have been suggested for or applied to the Resource Constrained Project Scheduling Problem (RCPSP). The existence of a number of highly accessible standard benchmark sets has promoted a research focus on finding anything that improves average solution quality, without investigating what effect is responsible for the improvement, or what is responsible for holding us back. This work focuses instead on understanding the original constructive Schedule Generation heuristics and their interaction with a well known but poorly understood post-processing step called Forward-Backward Improvement that is known to almost always improve any generated RCPSP schedule. We follow an empirical investigation methodology by first observing the effect of FBI on a large generated testset. Based on these observations we explain why FBI works by means of hypotheses on its operation. These hypotheses generate predictions that we subsequently successfully test in a second round of experiments. In the process we are able to propose a novel priority rule heuristic based on the principles of FBI. We find that this new rule outperforms the current best priority rule heuristic.AlgorithmicsSoftware TechnologyElectrical Engineering, Mathematics and Computer Scienc

    Decoupling a Resource Constraint through Fictitious Play in Multi-agent Sequential Decision Making

    No full text
    When multiple independent agents use a limited shared resource, they need to coordinate and thereby their planning problems become coupled. We present a resource assignment strategy that decouples agents using marginal utility cost, allowing them to plan individually. We show that agents converge to an expected cost curve by keeping a history of plans, inspired by fictitious play. This performs slightly better than a state-of-the-art best-response approach and is significantly more scalable than a preallocation Mixed-Integer Linear Programming formulation, providing a good trade-off between performance and quality.Algorithmic

    Multi-agent Planning Under Uncertainty for Capacity Management

    No full text
    Demand response refers to the concept that power consumption should aim to match supply, instead of supply following demand. It is a key technology to enable the successful transition to an electricity system that incorporates more and more intermittent and uncontrollable renewable energy sources. For instance, loads such as heat pumps or charging of electric vehicles are potentially flexible and could be shifted in time to take advantage of renewable generation. Load shifting is most effective, however, when it is performed in a coordinated fashion to avoid merely shifting the peak instead of flattening it. In this chapter, we discuss multi-agent planning algorithms for capacity management to address this issue. Our methods focus in particular on addressing the challenges that result from the need to plan ahead into the future given uncertainty in supply and demand. We demonstrate that by decoupling the interactions of agents with the constraint, the resulting algorithms are able to compute effective demand response policies for hundreds of agents.Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care   Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.Algorithmic

    Preallocation and Planning under Stochastic Resource Constraints

    No full text
    Resource constraints frequently complicate multi-agent planning problems. Existing algorithms for resource-constrained, multi-agent planning problems rely on the assumption that the constraints are deterministic. However, frequently resource constraints are themselves subject to uncertainty from external influences. Uncertainty about constraints is especially challenging when agents must execute in an environment where communication is unreliable, making on-line coordination difficult. In those cases, it is a significant challenge to find coordinated allocations at plan time depending on availability at run time. To address these limitations, we propose to extend algorithms for constrained multi-agent planning problems to handle stochastic resource constraints. We show how to factorize resource limit uncertainty and use this to develop novel algorithms to plan policies for stochastic constraints. We evaluate the algorithms on a search-and-rescue problem and on a power-constrained planning domain where the resource constraints are decided by nature. We show that plans taking into account all potential realizations of the constraint obtain significantly better utility than planning for the expectation, while causing fewer constraint violations.Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.Algorithmic

    Evaluation and Improvement of Laruelle-Widgrén Inverse Banzhaf Approximation

    No full text
    Voting is a popular way of reaching decisions in multi-agent systems. Weighted voting in particular allows different agents to have varying levels of influence on the decision taken: each agent’s vote carries a weight, and a proposal is accepted if the sum of the weights of the agents in favor of the proposal is at least equal to a given quota. Unfortunately, there is no clear and unambiguous relation between a player’s weight and the extent of her influence on the outcome of the decision making process. Different measures of ‘power’ have been proposed, such as the Banzhaf and the Shapley-Shubik indices. Here we consider the ‘inverse’ problem: given a vector of desired power indices for the players, how should we set their weights and the quota such that the players’ power in the resulting game comes as close as possible to the target vector? There has been some work on this problem, both heuristic and exact, but little is known about the approximation quality of the heuristics for this problem. The goal of this paper is to empirically evaluate the heuristic algorithm for the Inverse Banzhaf Index problem by Laruelle and Widgrén. We analyze and evaluate the intuition behind this algorithm. We found that the algorithm can not handle general inputs well, and often fails to improve inputs. It is also shown to diverge after only tens of iterations. Based on our analysis, we present three alternative extensions of the algorithm that do not alter the complexity but can result in up to a factor 6.5 improvement in solution quality.Software Computer TechnologyElectrical Engineering, Mathematics and Computer Scienc

    Constrained multiagent Markov decision processes: A taxonomy of problems and algorithms

    No full text
    In domains such as electric vehicle charging, smart distribution grids and autonomous warehouses, multiple agents share the same resources. When planning the use of these resources, agents need to deal with the uncertainty in these domains. Although several models and algorithms for such constrained multiagent planning problems under uncertainty have been proposed in the literature, it remains unclear when which algorithm can be applied. In this survey we conceptualize these domains and establish a generic problem class based on Markov decision processes. We identify and compare the conditions under which algorithms from the planning literature for problems in this class can be applied: whether constraints are soft or hard, whether agents are continuously connected, whether the domain is fully observable, whether a constraint is momentarily (instantaneous) or on a budget, and whether the constraint is on a single resource or on multiple. Further we discuss the advantages and disadvantages of these algorithms. We conclude by identifying open problems that are directly related to the conceptualized domains, as well as in adjacent research areas.Algorithmic

    Bounding the probability of resource constraint violations in multi-agent MDPs

    No full text
    Multi-agent planning problems with constraints on global resource consumption occur in several domains. Existing algorithms for solving Multi-agent Markov Decision Processes can compute policies that meet a resource constraint in expectation, but these policies provide no guarantees on the probability that a resource constraint violation will occur. We derive a method to bound constraint violation probabilities using Hoeffding's inequality. This method is applied to two existing approaches for computing policies satisfying constraints: the Constrained MDP framework and a Column Generation approach. We also introduce an algorithm to adaptively relax the bound up to a given maximum violation tolerance. Experiments on a hard toy problem show that the resulting policies outperform static optimal resource allocations to an arbitrary level. By testing the algorithms on more realistic planning domains from the literature, we demonstrate that the adaptive bound is able to efficiently trade off violation probability with expected value, outperforming state-of-the-art planners.Algorithmic

    www.website4free.co.nz

    No full text
    Report on the planning, design, development and testing of the 'content management' management website 'website4free.co.nz'. The goal of this project was to develop a system that will allow the users to register and build their own website base on a chosen template.Computer ScienceElectrical Engineering, Mathematics and Computer Scienc

    Capacity-aware Sequential Recommendations

    No full text
    Personalized recommendations are increasingly important to engage users and guide them through large systems, for example when recommending points of interest to tourists visiting a popular city. To maximize long-term user experience, the system should consider issuing recommendations sequentially, since by observing the user's response to a recommendation, the system can update its estimate of the user's (latent) interests. However, as traditional recommender systems target individuals, their effect on a collective of users can unintentionally overload capacity. Therefore, recommender systems should not only consider the users' interests, but also the effect of recommendations on the available capacity.The structure in such a constrained, multi-agent, partially observable decision problem can be exploited by a novel belief-space sampling algorithm which bounds the size of the state space by a limit on regret. By exploiting the stationary structure of the problem, our algorithm is significantly more scalable than existing approximate solvers. Moreover, by explicitly considering the information value of actions, this algorithm significantly improves the quality of recommendations over an extension of posterior sampling reinforcement learning to the constrained multi-agent case. We show how to decouple constraint satisfaction from sequential recommendation policies, resulting in algorithms which issue recommendations to thousands of agents while respecting constraints.Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.Algorithmic
    corecore